%matplotlib inline
import matplotlib.pyplot as plt
from datetime import datetime
import os
import subprocess
import requests
import boto3
import pandas as pd
import numpy as np
import xarray as xr
import rasterio as rio
from rasterio.session import AWSSession
from rasterio.plot import show
import rioxarray
import geopandas
import pyproj
from pyproj import Proj
from shapely.ops import transform
import geoviews as gv
from cartopy import crs
import hvplot.xarray
import holoviews as hv
gv.extension('bokeh', 'matplotlib')

Direct S3 Data Access
Summary
In the previous exercise we searched for and discovered cloud data assets that met certain search criteria (i.e., intersected our region of interest and fell within a specified date range). In this exercise we will leverage the data access links to show how to access HLS data directly from S3.
HTTPS vs. S3
NASA Earthdata Cloud provides two pathways for accessing data from the cloud. The first is via HTTPS; the other is direct S3 bucket access.
There are trade-offs to consider when selecting between the two. HTTPS access works from anywhere, including your local machine, but data are streamed over the public internet. Direct S3 access avoids that overhead and is typically faster, but it only works from compute running in the same AWS region as the data (us-west-2).
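For illustration, the same HLS asset is addressed differently under each pathway; only the scheme and host change. The granule name below is just an example, and the HTTPS host shown is assumed to be the LP DAAC cloud archive:

# Hypothetical example: one HLS red-band asset via each access pathway
https_url = 'https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSL30.015/HLS.L30.T13TGF.2021133T172406.v1.5.B04.tif'
s3_url = 's3://lp-prod-protected/HLSL30.015/HLS.L30.T13TGF.2021133T172406.v1.5.B04.tif'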
Objective
- Configure our notebook environment and retrieve temporary S3 credentials for in-region direct S3 bucket access
- Access a single HLS file
- Access and clip an HLS file to a region of interest
- Create an HLS time series data array
Let’s get started!
Import Required Packages
Configure Local Environment and Get Temporary Credentials
To perform direct S3 data access one needs to acquire temporary S3 credentials. The credentials give users direct access to S3 buckets in NASA Earthdata Cloud. AWS credentials should not be shared, so take precautions when using them in notebooks or scripts. Note, these temporary credentials are valid for only 1 hour. For more information regarding the temporary credentials visit https://data.lpdaac.earthdatacloud.nasa.gov/s3credentialsREADME. A netrc file is required to acquire these credentials. Use the NASA Earthdata Authentication notebook to create a netrc file in your home directory.
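As a quick sanity check before requesting credentials, you can confirm that a netrc entry exists for Earthdata Login using Python's standard netrc module (a minimal sketch; urs.earthdata.nasa.gov is the Earthdata Login host):

import netrc
try:
    # authenticators() returns (login, account, password) if an entry exists
    auth = netrc.netrc().authenticators('urs.earthdata.nasa.gov')
    print('netrc entry found' if auth else 'no urs.earthdata.nasa.gov entry found')
except FileNotFoundError:
    print('no netrc file found in your home directory')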
s3_cred_endpoint = 'https://data.lpdaac.earthdatacloud.nasa.gov/s3credentials'

def get_temp_creds():
    temp_creds_url = s3_cred_endpoint
    return requests.get(temp_creds_url).json()

temp_creds_req = get_temp_creds()
#temp_creds_req # !!! BEWARE, removing the # on this line will print your temporary S3 credentials.

Insert the credentials into our boto3 session and configure our rasterio environment for data access
Create a boto3 Session object using your temporary credentials. This Session is used to pass credentials and configuration to AWS so we can interact with S3 objects from applicable buckets.
session = boto3.Session(aws_access_key_id=temp_creds_req['accessKeyId'],
aws_secret_access_key=temp_creds_req['secretAccessKey'],
aws_session_token=temp_creds_req['sessionToken'],
                        region_name='us-west-2')

For this exercise, we are going to open up a context manager for the notebook using the rasterio.env module to store the required GDAL and AWS configurations we need to access the data in Earthdata Cloud. While the context manager is open (rio_env.__enter__()) we will be able to run the open or get data commands that would typically be executed within a with statement, thus allowing us to interact with the data more freely. We'll close the context (rio_env.__exit__()) at the end of the notebook.
GDAL environment variables must be configured to access Earthdata Cloud data assets. Geospatial data access Python packages like rasterio and rioxarray depend on GDAL, leveraging GDAL's "Virtual File Systems" to read remote files. GDAL has many environment variables that control its behavior. Changing these settings can mean the difference between being able to access a file or not. They can also have an impact on performance.
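Beyond the settings used below, a few other GDAL configuration options are commonly tuned for remote COG reads. The sketch below is illustrative only and not required for this tutorial:

# Illustrative sketch: optional GDAL settings sometimes used to tune remote reads
tuning_env = rio.Env(
    CPL_VSIL_CURL_ALLOWED_EXTENSIONS='tif',  # only issue requests for .tif files
    GDAL_HTTP_MAX_RETRY='3',                 # retry transient HTTP failures
    GDAL_HTTP_RETRY_DELAY='5',               # seconds to wait between retries
    VSI_CACHE='TRUE'                         # cache blocks read through the virtual file system
)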
rio_env = rio.Env(AWSSession(session),
GDAL_DISABLE_READDIR_ON_OPEN='EMPTY_DIR',
GDAL_HTTP_COOKIEFILE=os.path.expanduser('~/cookies.txt'),
GDAL_HTTP_COOKIEJAR=os.path.expanduser('~/cookies.txt'))
rio_env.__enter__()

<rasterio.env.Env at 0x7fdb997f6890>
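For reference, calling __enter__() and __exit__() manually is equivalent to the usual scoped form sketched below (commented out here, since the scoped version would require all data access to happen inside the block):

# with rio.Env(AWSSession(session),
#              GDAL_DISABLE_READDIR_ON_OPEN='EMPTY_DIR',
#              GDAL_HTTP_COOKIEFILE=os.path.expanduser('~/cookies.txt'),
#              GDAL_HTTP_COOKIEJAR=os.path.expanduser('~/cookies.txt')):
#     da = rioxarray.open_rasterio(s3_link)  # all reads would go here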
Read in S3 Links
In the CMR-STAC API tutorial we saved multiple text files containing links, both HTTPS and S3 links, to Harmonized Landsat Sentinel-2 (HLS) cloud data assets. We will now read in one of those files and show how to access those data assets.
List the available files in the data directory
for f in os.listdir('./data'):
    print(f)

S3_T12TGF_B04_Links.txt
nir_stack.vrt
S3_T12TGF_NIR_VSI_Links.txt
S3_T12TGF_Fmask_Links.txt
HTTPS_T12TGF_B8A_Links.txt
ne_w_agfields.geojson
red_stack.vrt
.ipynb_checkpoints
S3_T12TGF_RED_VSI_Links.txt
S3_T12TGF_B02_Links.txt
HTTPS_T12TGF_B05_Links.txt
S3_T12TGF_B05_Links.txt
fmask_stack.vrt
HTTPS_T12TGF_Fmask_Links.txt
S3_T12TGF_B8A_Links.txt
S3_T12TGF_FMASK_VSI_Links.txt
HTTPS_T12TGF_B04_Links.txt
HTTPS_T12TGF_B02_Links.txt
We will save our list of links and a single link as Python objects for use later.
s3_links = open('./data/S3_T12TGF_B04_Links.txt').read().splitlines()
s3_links

['s3://lp-prod-protected/HLSL30.015/HLS.L30.T13TGF.2021133T172406.v1.5.B04.tif',
 's3://lp-prod-protected/HLSS30.015/HLS.S30.T13TGF.2021133T173859.v1.5.B04.tif',
 's3://lp-prod-protected/HLSL30.015/HLS.L30.T13TGF.2021140T173021.v1.5.B04.tif',
 's3://lp-prod-protected/HLSS30.015/HLS.S30.T13TGF.2021140T172859.v1.5.B04.tif',
 's3://lp-prod-protected/HLSS30.015/HLS.S30.T13TGF.2021145T172901.v1.5.B04.tif',
 's3://lp-prod-protected/HLSL30.015/HLS.L30.T13TGF.2021156T173029.v1.5.B04.tif',
 's3://lp-prod-protected/HLSS30.015/HLS.S30.T13TGF.2021163T173909.v1.5.B04.tif',
 's3://lp-prod-protected/HLSL30.015/HLS.L30.T13TGF.2021165T172422.v1.5.B04.tif',
 's3://lp-prod-protected/HLSS30.015/HLS.S30.T13TGF.2021165T172901.v1.5.B04.tif',
 's3://lp-prod-protected/HLSS30.015/HLS.S30.T13TGF.2021185T172901.v1.5.B04.tif',
 's3://lp-prod-protected/HLSL30.015/HLS.L30.T13TGF.2021188T173037.v1.5.B04.tif',
 's3://lp-prod-protected/HLSS30.015/HLS.S30.T13TGF.2021190T172859.v1.5.B04.tif',
 's3://lp-prod-protected/HLSS30.015/HLS.S30.T13TGF.2021198T173911.v1.5.B04.tif',
 's3://lp-prod-protected/HLSS30.015/HLS.S30.T13TGF.2021200T172859.v1.5.B04.tif',
 's3://lp-prod-protected/HLSS30.015/HLS.S30.T13TGF.2021203T173909.v1.5.B04.tif',
 's3://lp-prod-protected/HLSL30.015/HLS.L30.T13TGF.2021204T173042.v1.5.B04.tif',
 's3://lp-prod-protected/HLSS30.015/HLS.S30.T13TGF.2021215T172901.v1.5.B04.tif',
 's3://lp-prod-protected/HLSL30.015/HLS.L30.T13TGF.2021220T173049.v1.5.B04.tif',
 's3://lp-prod-protected/HLSL30.015/HLS.L30.T13TGF.2021229T172441.v1.5.B04.tif']
s3_link = s3_links[0]
s3_link

's3://lp-prod-protected/HLSL30.015/HLS.L30.T13TGF.2021133T172406.v1.5.B04.tif'
Read in a single HLS file
We'll access the HLS S3 object using the rioxarray Python package. The package is an extension of xarray and rasterio, allowing users to read in and interact with geospatial data using xarray data structures. We will also leverage the tight integration between xarray and dask to lazily read in data via the chunks parameter. This allows us to connect to the HLS S3 object and read only metadata, not loading the data into memory until we request it via the load() function.
hls_da = rioxarray.open_rasterio(s3_link, chunks=True)
hls_da

<xarray.DataArray (band: 1, y: 3660, x: 3660)>
[13395600 values with dtype=int16]
Coordinates:
  * band         (band) int64 1
  * x            (x) float64 7e+05 7e+05 7e+05 ... 8.097e+05 8.097e+05 8.097e+05
  * y            (y) float64 4.6e+06 4.6e+06 4.6e+06 ... 4.49e+06 4.49e+06
    spatial_ref  int64 0
Attributes:
    _FillValue:    -9999.0
    scale_factor:  0.0001
    add_offset:    0.0
    long_name:     Red
When GeoTIFFs/Cloud Optimized GeoTIFFs are read in, a band coordinate variable is automatically created (see the printout above). In this exercise we will not use that coordinate variable, so we will remove it using the squeeze() function to avoid confusion.
hls_da = hls_da.squeeze('band', drop=True)
hls_da

<xarray.DataArray (y: 3660, x: 3660)>
[13395600 values with dtype=int16]
Coordinates:
  * x            (x) float64 7e+05 7e+05 7e+05 ... 8.097e+05 8.097e+05 8.097e+05
  * y            (y) float64 4.6e+06 4.6e+06 4.6e+06 ... 4.49e+06 4.49e+06
    spatial_ref  int64 0
Attributes:
    _FillValue:    -9999.0
    scale_factor:  0.0001
    add_offset:    0.0
    long_name:     Red
Plot the HLS S3 object
hls_da.hvplot.image(x='x', y='y', cmap='fire', rasterize=True, width=800, height=600, colorbar=True) # colormaps -> https://holoviews.org/user_guide/Colormaps.html
We can print the data values as a numpy array by calling .values.
hls_da.values

array([[-9999, -9999, -9999, ...,  1527,  1440,  1412],
       [-9999, -9999, -9999, ...,  1493,  1476,  1407],
       [-9999, -9999, -9999, ...,  1466,  1438,  1359],
       ...,
       [-9999, -9999, -9999, ...,  1213,  1295,  1159],
       [-9999, -9999, -9999, ...,  1042,  1232,  1185],
       [-9999, -9999, -9999, ...,   954,  1127,  1133]], dtype=int16)
Up to this point, we have not read anything but metadata into memory. To load the data into memory we can call the .load() function.
hls_da_data = hls_da.load()
hls_da_data

<xarray.DataArray (y: 3660, x: 3660)>
array([[-9999, -9999, -9999, ...,  1527,  1440,  1412],
       [-9999, -9999, -9999, ...,  1493,  1476,  1407],
       [-9999, -9999, -9999, ...,  1466,  1438,  1359],
       ...,
       [-9999, -9999, -9999, ...,  1213,  1295,  1159],
       [-9999, -9999, -9999, ...,  1042,  1232,  1185],
       [-9999, -9999, -9999, ...,   954,  1127,  1133]], dtype=int16)
Coordinates:
  * x            (x) float64 7e+05 7e+05 7e+05 ... 8.097e+05 8.097e+05 8.097e+05
  * y            (y) float64 4.6e+06 4.6e+06 4.6e+06 ... 4.49e+06 4.49e+06
    spatial_ref  int64 0
Attributes:
    _FillValue:    -9999.0
    scale_factor:  0.0001
    add_offset:    0.0
    long_name:     Red
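The values above are raw int16 digital numbers. To work in reflectance units, you would typically mask the fill value and then apply the scale factor, both stored in the attributes printed above (a sketch):

# Mask the fill value, then scale the digital numbers to reflectance
red_reflectance = hls_da_data.where(hls_da_data != hls_da_data.attrs['_FillValue'])
red_reflectance = red_reflectance * hls_da_data.attrs['scale_factor']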
del(hls_da_data)

Read in and clip a single HLS file
To clip the HLS file, the feature representing our region of interest must be in the same coordinate reference system (CRS) as the HLS file. The map projection for our HLS file is Universal Transverse Mercator (UTM) Zone 13N, while our feature is in the WGS84 geographic coordinate system. We therefore need to transform our feature from its geographic CRS to the UTM projected coordinate system (i.e., UTM Zone 13N).
Read in our geojson file and transform its CRS
field = geopandas.read_file('./data/ne_w_agfields.geojson')

Let's take a look at the bounding coordinate values.
field_shape = field.geometry[0]
field_shape.bounds

(-101.67271614074707, 41.04754380304359, -101.65344715118408, 41.06213891056728)
Note, the values above are in decimal degrees and represent the longitude and latitude for the lower left corner (-101.67271614074707, 41.04754380304359) and upper right corner (-101.65344715118408, 41.06213891056728) respectively.
Get the projection information from the HLS file
hls_proj = hls_da.rio.crs
hls_proj

CRS.from_epsg(32613)
Transform coordinates from lat/lon (decimal degrees) to UTM (meters)
geo_CRS = Proj('+proj=longlat +datum=WGS84 +no_defs', preserve_units=True) # Source coordinate system of the ROI
project = pyproj.Transformer.from_proj(geo_CRS, hls_proj) # Set up the transformation
fsUTM = transform(project.transform, field_shape)
fsUTM.bounds

(779588.4994601272, 4549370.366049466, 781270.1479326887, 4551052.979639321)
The coordinates for our feature have now been converted to UTM Zone 13N, where meters are the designated unit. Note the difference in the values between field_shape.bounds (in geographic coordinates) and fsUTM.bounds (in UTM projection).
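As an aside, geopandas can perform the same reprojection in one step with to_crs(), which may be simpler when working with a whole GeoDataFrame (a sketch; it should yield the same UTM geometry as fsUTM):

# Alternative: reproject the GeoDataFrame directly to the HLS CRS
field_utm = field.to_crs(hls_proj)  # hls_proj is the CRS object retrieved above
fsUTM_alt = field_utm.geometry[0]   # shapely geometry in UTM Zone 13N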
Now we can clip our HLS file to our region of interest!
Access and clip the HLS file
We can now use our transformed ROI geometry to clip the HLS S3 object we accessed before. We'll use the rio.clip() method from rioxarray.
hls_da_clip = rioxarray.open_rasterio(s3_link, chunks=True).squeeze('band', drop=True).rio.clip([fsUTM])
hls_da_clip

<xarray.DataArray (y: 56, x: 56)>
dask.array<astype, shape=(56, 56), dtype=int16, chunksize=(56, 56), chunktype=numpy.ndarray>
Coordinates:
  * y            (y) float64 4.551e+06 4.551e+06 ... 4.549e+06 4.549e+06
  * x            (x) float64 7.796e+05 7.796e+05 ... 7.812e+05 7.812e+05
    spatial_ref  int64 0
Attributes:
    scale_factor:  0.0001
    add_offset:    0.0
    long_name:     Red
    _FillValue:    -9999
hls_da_clip.hvplot.image(x='x', y='y', crs='EPSG:32613', cmap='fire', rasterize=True, width=800, height=600, colorbar=True)
Read in and clip an HLS time series
Now we’ll read in multiple HLS S3 objects as a time series xarray. Let’s print the links list again to see what we’re working with.
s3_links

['s3://lp-prod-protected/HLSL30.015/HLS.L30.T13TGF.2021133T172406.v1.5.B04.tif',
 's3://lp-prod-protected/HLSS30.015/HLS.S30.T13TGF.2021133T173859.v1.5.B04.tif',
 's3://lp-prod-protected/HLSL30.015/HLS.L30.T13TGF.2021140T173021.v1.5.B04.tif',
 's3://lp-prod-protected/HLSS30.015/HLS.S30.T13TGF.2021140T172859.v1.5.B04.tif',
 's3://lp-prod-protected/HLSS30.015/HLS.S30.T13TGF.2021145T172901.v1.5.B04.tif',
 's3://lp-prod-protected/HLSL30.015/HLS.L30.T13TGF.2021156T173029.v1.5.B04.tif',
 's3://lp-prod-protected/HLSS30.015/HLS.S30.T13TGF.2021163T173909.v1.5.B04.tif',
 's3://lp-prod-protected/HLSL30.015/HLS.L30.T13TGF.2021165T172422.v1.5.B04.tif',
 's3://lp-prod-protected/HLSS30.015/HLS.S30.T13TGF.2021165T172901.v1.5.B04.tif',
 's3://lp-prod-protected/HLSS30.015/HLS.S30.T13TGF.2021185T172901.v1.5.B04.tif',
 's3://lp-prod-protected/HLSL30.015/HLS.L30.T13TGF.2021188T173037.v1.5.B04.tif',
 's3://lp-prod-protected/HLSS30.015/HLS.S30.T13TGF.2021190T172859.v1.5.B04.tif',
 's3://lp-prod-protected/HLSS30.015/HLS.S30.T13TGF.2021198T173911.v1.5.B04.tif',
 's3://lp-prod-protected/HLSS30.015/HLS.S30.T13TGF.2021200T172859.v1.5.B04.tif',
 's3://lp-prod-protected/HLSS30.015/HLS.S30.T13TGF.2021203T173909.v1.5.B04.tif',
 's3://lp-prod-protected/HLSL30.015/HLS.L30.T13TGF.2021204T173042.v1.5.B04.tif',
 's3://lp-prod-protected/HLSS30.015/HLS.S30.T13TGF.2021215T172901.v1.5.B04.tif',
 's3://lp-prod-protected/HLSL30.015/HLS.L30.T13TGF.2021220T173049.v1.5.B04.tif',
 's3://lp-prod-protected/HLSL30.015/HLS.L30.T13TGF.2021229T172441.v1.5.B04.tif']
Currently, the utilities and packages used in Python to read in GeoTIFF/COG files do not recognize the acquisition dates stored in the internal metadata. To account for the date of each file we must create a time variable and add it as a dimension in our final time series xarray. We'll create a function that extracts the date from each file link and builds an xarray variable containing a time array of datetime objects.
def time_index_from_filenames(file_links):
    '''
    Helper function to create a pandas DatetimeIndex
    '''
    return [datetime.strptime(f.split('.')[-5], '%Y%jT%H%M%S') for f in file_links]

time = xr.Variable('time', time_index_from_filenames(s3_links))
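The format string '%Y%jT%H%M%S' decodes the HLS naming convention of year, day of year (%j), and acquisition time. A quick check against the first granule name:

# Day 133 of 2021 is May 13, matching the first time stamp seen in the outputs above
datetime.strptime('2021133T172406', '%Y%jT%H%M%S')  # -> datetime(2021, 5, 13, 17, 24, 6)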
We'll now specify a chunk size to use that matches the internal tiling of HLS files. This will help improve performance.

chunks=dict(band=1, x=1024, y=1024)

Now, we will create our time series.
hls_ts_da = xr.concat([rioxarray.open_rasterio(f, chunks=chunks).squeeze('band', drop=True) for f in s3_links], dim=time)
hls_ts_da

<xarray.DataArray (time: 19, y: 3660, x: 3660)>
dask.array<concatenate, shape=(19, 3660, 3660), dtype=int16, chunksize=(1, 1024, 1024), chunktype=numpy.ndarray>
Coordinates:
  * x            (x) float64 7e+05 7e+05 7e+05 ... 8.097e+05 8.097e+05 8.097e+05
  * y            (y) float64 4.6e+06 4.6e+06 4.6e+06 ... 4.49e+06 4.49e+06
    spatial_ref  int64 0
  * time         (time) datetime64[ns] 2021-05-13T17:24:06 ... 2021-08-17T17:24:41
Attributes:
    _FillValue:    -9999.0
    scale_factor:  0.0001
    add_offset:    0.0
    long_name:     Red
Since we used the chunks parameter while reading the data, the hls_ts_da object has not actually been read into memory. To do that we'll use the load() function. But before that, we'll clip the hls_ts_da object to our ROI using our transformed ROI coordinates, so that only the clipped data are loaded.
hls_ts_da_clip = hls_ts_da.rio.clip([fsUTM]).load()
hls_ts_da_clip

<xarray.DataArray (time: 19, y: 56, x: 56)>
array([[[-9999, -9999, -9999, ...,   980, -9999, -9999],
        [-9999, -9999, -9999, ...,   287, -9999, -9999],
        [ 1573,  1692,  1708, ...,   410, -9999, -9999],
        ...,
        [-9999, -9999,  1165, ...,  1808,  1869,  1906],
        [-9999, -9999,   989, ..., -9999, -9999, -9999],
        [-9999, -9999,  1085, ..., -9999, -9999, -9999]],

       [[-9999, -9999, -9999, ...,   860, -9999, -9999],
        [-9999, -9999, -9999, ...,   496, -9999, -9999],
        [ 2681,  2773,  2496, ...,   550, -9999, -9999],
        ...,
        [-9999, -9999,  3847, ...,  1997,  1914,  1831],
        [-9999, -9999,  4062, ..., -9999, -9999, -9999],
        [-9999, -9999,  4313, ..., -9999, -9999, -9999]],

       ...,

       [[-9999, -9999, -9999, ...,  1313, -9999, -9999],
        [-9999, -9999, -9999, ...,  1327, -9999, -9999],
        [ 1091,  1094,  1179, ...,  1223, -9999, -9999],
        ...,
        [-9999, -9999,  1145, ...,  1005,  1097,  1197],
        [-9999, -9999,  1037, ..., -9999, -9999, -9999],
        [-9999, -9999,  1114, ..., -9999, -9999, -9999]],

       [[-9999, -9999, -9999, ...,  1272, -9999, -9999],
        [-9999, -9999, -9999, ...,  1231, -9999, -9999],
        [ 1086,  1105,  1193, ...,  1205, -9999, -9999],
        ...,
        [-9999, -9999,  1045, ...,  1049,  1142,  1219],
        [-9999, -9999,   926, ..., -9999, -9999, -9999],
        [-9999, -9999,  1076, ..., -9999, -9999, -9999]]], dtype=int16)
Coordinates:
  * y            (y) float64 4.551e+06 4.551e+06 ... 4.549e+06 4.549e+06
  * x            (x) float64 7.796e+05 7.796e+05 ... 7.812e+05 7.812e+05
  * time         (time) datetime64[ns] 2021-05-13T17:24:06 ... 2021-08-17T17:24:41
    spatial_ref  int64 0
Attributes:
    scale_factor:  0.0001
    add_offset:    0.0
    long_name:     Red
    _FillValue:    -9999
Now, we’ll see what we have. Use hvplot to plot the clipped time series
hls_ts_da_clip.hvplot.image(x='x', y='y', width=800, height=600, colorbar=True, cmap='fire').opts(clim=(hls_ts_da_clip.values.min(), hls_ts_da_clip.values.max()))

# Exit our context
rio_env.__exit__()

Resources
- Build time series from multiple GeoTIFF files
- https://git.earthdata.nasa.gov/projects/LPDUR/repos/lpdaac_cloud_data_access/browse
- https://git.earthdata.nasa.gov/projects/LPDUR/repos/hls-tutorial/browse
Rough PODAAC ECCO SSH Example
s3_cred_endpoint = {
'podaac':'https://archive.podaac.earthdata.nasa.gov/s3credentials',
'lpdaac':'https://data.lpdaac.earthdatacloud.nasa.gov/s3credentials'
}

from pystac_client import Client

podaac_cat = Client.open('https://cmr.earthdata.nasa.gov/stac/POCLOUD/')

search = podaac_cat.search(
collections=['ECCO_L4_SSH_05DEG_MONTHLY_V4R4'],
datetime='2015'
)

search.matched()

items = search.get_all_items()
list(items)

ssh_https = items[1].get_assets()['data'].href
ssh_https

ssh_s3 = ssh_https.replace('https://archive.podaac.earthdata.nasa.gov/', 's3://')
ssh_s3

import rasterio as rio
import rioxarray

ssh = rioxarray.open_rasterio(ssh_s3)
ssh[1].squeeze('band', drop=True)
ssh[0].SSH

Read In and Process STAC Asset Links
In the previous section, we used the NASA CMR-STAC API to discover HLS assets that intersect with our search criteria, i.e., our ROI, date range, and collections. The search results were filtered and saved as text files of individual bands for each tile. We will read in the text files for tile T13TGF for the RED (L30: B04 & S30: B04), NIR (L30: B05 & S30: B8A), and Fmask bands.
List text files with HLS links
[t for t in os.listdir('./data') if '.txt' in t]

Read in our asset links for B04 (RED)
red_s3_links = open('./data/S3_T12TGF_B04_Links.txt').read().splitlines()
red_s3_links

Read in and combine our asset links for B05 (Landsat NIR) and B8A (Sentinel-2 NIR)
The near-infrared (NIR) band for Landsat is B05, while the NIR band for Sentinel-2 is B8A. In the next step we will read in and combine the two lists into a single NIR list.
nir_bands = ['B05', 'B8A']
nir_link_text = [x for x in os.listdir('./data') if any(b in x for b in nir_bands) and 'S3' in x]
nir_s3_links = []
for file in nir_link_text:
nir_s3_links.extend(open(f'./data/{file}').read().splitlines())
nir_s3_links

Read in our asset links for Fmask
fmask_s3_links = open('./data/S3_T12TGF_Fmask_Links.txt').read().splitlines()
#fmask_s3_links

In this example we will use the gdalbuildvrt utility to create a time series virtual raster format (VRT) file. The utility, however, expects the links to be formatted as GDAL virtual file system (VSI) paths rather than the raw asset links. We will therefore use the VSI path to access our assets. The example below shows the VSI path substitution for S3 (vsis3) links.
/vsis3/lp-prod-protected/HLSS30.015/HLS.S30.T13TGF.2020191T172901.v1.5.B04.tif
See the GDAL Virtual File Systems documentation for more information regarding GDAL VSI.
Write out a new text file containing the vsis3 path
with open('./data/S3_T12TGF_RED_VSI_Links.txt', 'w') as f:
    links_vsi = [r.replace('s3://', '/vsis3/') + '\n' for r in red_s3_links]
    for link in links_vsi:
        f.write(link)

with open('./data/S3_T12TGF_NIR_VSI_Links.txt', 'w') as f:
    links_vsi = [r.replace('s3://', '/vsis3/') + '\n' for r in nir_s3_links]
    for link in links_vsi:
        f.write(link)

with open('./data/S3_T12TGF_FMASK_VSI_Links.txt', 'w') as f:
    links_vsi = [r.replace('s3://', '/vsis3/') + '\n' for r in fmask_s3_links]
    for link in links_vsi:
        f.write(link)

Read in geoJSON for subsetting
We will use the input geoJSON file to clip the source data to our desired region of interest.
field = geopandas.read_file('./data/ne_w_agfields.geojson')
fieldShape = field['geometry'][0]

To clip the source data to our input feature boundary, we need to transform the feature boundary from its original WGS84 coordinate reference system to the projected reference system of the source HLS file (i.e., UTM Zone 13N).
foa_url = red_s3_links[0]
with rio.open(foa_url) as src:
hls_proj = src.crs.to_string()
hls_proj

Transform geoJSON feature from WGS84 to UTM
geo_CRS = Proj('+proj=longlat +datum=WGS84 +no_defs', preserve_units=True) # Source coordinate system of the ROI
project = pyproj.Transformer.from_proj(geo_CRS, hls_proj) # Set up the transformation
fsUTM = transform(project.transform, fieldShape)

Direct S3 Data Access
Start up a dask client
#from dask.distributed import Client
#client = Client(n_workers=2)
#client

There are multiple ways to read COG data in as a time series. The subprocess package is used in this example to run GDAL's build virtual raster (gdalbuildvrt) utility outside our Python session. First we'll construct a string object with the command and its parameters (including our temporary credentials). Then, we run the command using the subprocess.call() function.
Build GDAL VRT Files
Construct the GDAL VRT call
build_red_vrt = f"gdalbuildvrt ./data/red_stack.vrt -separate -input_file_list ./data/S3_T12TGF_RED_VSI_Links.txt --config AWS_ACCESS_KEY_ID {temp_creds_req['accessKeyId']} --config AWS_SECRET_ACCESS_KEY {temp_creds_req['secretAccessKey']} --config AWS_SESSION_TOKEN {temp_creds_req['sessionToken']} --config GDAL_DISABLE_READDIR_ON_OPEN TRUE"
#build_red_vrt # !!! BEWARE, removing the # on this line will print your temporary S3 credentials.

We now have a fully configured gdalbuildvrt string that we can pass to Python's subprocess module to run the gdalbuildvrt executable outside our Python environment.
Execute gdalbuildvrt to construct a VRT on disk from the S3 links
%%time
subprocess.call(build_red_vrt, shell=True)

0

0 means success! We'll have some troubleshooting to do if we get any other value. In this tutorial, the path for the output VRT file or the input file list are the first things to check.
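As an aside, the same VRT can be built without leaving Python through GDAL's bindings, if the osgeo package is installed (a sketch under that assumption; the AWS credentials would still need to be set as GDAL config options or environment variables):

# Sketch: build the VRT with the GDAL Python API instead of subprocess
# from osgeo import gdal
# vsi_links = [r.replace('s3://', '/vsis3/') for r in red_s3_links]
# vrt_opts = gdal.BuildVRTOptions(separate=True)  # one band per input file
# gdal.BuildVRT('./data/red_stack_api.vrt', vsi_links, options=vrt_opts)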
While we’re here, we’ll build the VRT files for the NIR layers and the Fmask layers.
build_nir_vrt = f"gdalbuildvrt ./data/nir_stack.vrt -separate -input_file_list ./data/S3_T12TGF_NIR_VSI_Links.txt --config AWS_ACCESS_KEY_ID {temp_creds_req['accessKeyId']} --config AWS_SECRET_ACCESS_KEY {temp_creds_req['secretAccessKey']} --config AWS_SESSION_TOKEN {temp_creds_req['sessionToken']} --config GDAL_DISABLE_READDIR_ON_OPEN TRUE"
subprocess.call(build_nir_vrt, shell=True)

build_fmask_vrt = f"gdalbuildvrt ./data/fmask_stack.vrt -separate -input_file_list ./data/S3_T12TGF_FMASK_VSI_Links.txt --config AWS_ACCESS_KEY_ID {temp_creds_req['accessKeyId']} --config AWS_SECRET_ACCESS_KEY {temp_creds_req['secretAccessKey']} --config AWS_SESSION_TOKEN {temp_creds_req['sessionToken']} --config GDAL_DISABLE_READDIR_ON_OPEN TRUE"
subprocess.call(build_fmask_vrt, shell=True)

Reading in an HLS time series
We can now read the VRT files into our Python session. A drawback of reading VRTs into Python is that the time coordinate variable needs to be constructed manually. Below we not only read in the VRT file using rioxarray, but we also repurpose the automatically generated band variable to hold our time information.
Read the RED VRT in as xarray with Dask backing
%%time
chunks=dict(band=1, x=1024, y=1024)
#chunks=dict(band=1, x=512, y=512)
red = rioxarray.open_rasterio('./data/red_stack.vrt', chunks=chunks) # Read in VRT
red = red.rename({'band':'time'}) # Rename the 'band' coordinate variable to 'time'
red['time'] = [datetime.strptime(x.split('.')[-5], '%Y%jT%H%M%S') for x in red_s3_links] # Extract the time information from the red input file names (the list used to build this VRT) and assign it to the time coordinate variable
red = red.sortby('time') # Sort by the time coordinate variable
red

Above we used the chunks parameter in the rioxarray.open_rasterio() function to enable Dask backing. This allows lazy reading of the data, meaning the data are not actually read into memory at this point. What we have is an object with some metadata and a pointer to the source data. The data will be streamed to us when we call for it, but not stored in memory until we call the Dask compute() or persist() methods.
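You can verify that nothing has been computed yet by inspecting the object:

print(type(red.data))  # a dask array -> the data are still lazy
print(red.chunks)      # the chunk layout that will be used for reads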
Print out the time coordinate
red.time

Clip out the ROI and persist the result in memory
Up until now, we haven’t read any of the HLS data into memory. Now we will use the persist() method to load the data into memory.
red_clip = red.rio.clip([fsUTM]).persist()
red_clip

Above, we persisted the clipped results to memory using the persist() method. This doesn't necessarily need to be done, but it will substantially improve the performance of the visualization of the time series below.
Plot red_clip with hvplot
red_clip.hvplot.image(x='x', y='y', width=800, height=600, colorbar=True, cmap='Reds').opts(clim=(0.0, red_clip.values.max()))

Read in the NIR and Fmask VRT files
%%time
chunks=dict(band=1, x=1024, y=1024)
nir = rioxarray.open_rasterio('./data/nir_stack.vrt', chunks=chunks) # Read in VRT
nir = nir.rename({'band':'time'}) # Rename the 'band' coordinate variable to 'time'
nir['time'] = [datetime.strptime(x.split('.')[-5], '%Y%jT%H%M%S') for x in nir_s3_links] # Extract the time information from the NIR input file names (the list used to build this VRT) and assign it to the time coordinate variable
nir = nir.sortby('time') # Sort by the time coordinate variable
nir

%%time
chunks=dict(band=1, x=1024, y=1024)
fmask = rioxarray.open_rasterio('./data/fmask_stack.vrt', chunks=chunks) # Read in VRT
fmask = fmask.rename({'band':'time'}) # Rename the 'band' coordinate variable to 'time'
fmask['time'] = [datetime.strptime(x.split('.')[-5], '%Y%jT%H%M%S') for x in fmask_s3_links] # Extract the time information from the Fmask input file names (the list used to build this VRT) and assign it to the time coordinate variable
fmask = fmask.sortby('time') # Sort by the time coordinate variable
fmask

Create an xarray dataset
We will now combine the RED, NIR, and Fmask arrays into a dataset and create/add a new NDVI variable. Because NDVI is a ratio of bands, the HLS scale factor (0.0001) cancels out, so the raw integer values can be used directly.
hls_ndvi = xr.Dataset({'red': red, 'nir': nir, 'fmask': fmask, 'ndvi': (nir - red) / (nir + red)})
hls_ndvi

Above, we created a new NDVI variable. Now, we will clip and plot our results.
ndvi_clip = hls_ndvi.ndvi.rio.clip([fsUTM]).persist()
ndvi_clip

Plot NDVI
ndvi_clip.hvplot.image(x='x', y='y', groupby='time', width=800, height=600, colorbar=True, cmap='YlGn').opts(clim=(0.0, 1.0))

You may have noticed that the images for some of the time steps are 'blurrier' than others. This is because those scenes are contaminated in some way, be it by clouds, cloud shadows, snow, or ice.
Apply quality filter
We want to keep NDVI values only where Fmask equals 0 (no clouds, no cloud shadow, no snow/ice, no water).
ndvi_clip_filter = hls_ndvi.ndvi.where(fmask==0, np.nan).rio.clip([fsUTM]).persist()

ndvi_clip_filter.hvplot.image(x='x', y='y', groupby='time', width=800, height=600, colorbar=True, cmap='YlGn').opts(clim=(0.0, 1.0))
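Note that Fmask is a bit-packed quality layer, so requiring an exact value of 0 is the strictest possible filter (every flag off). Finer-grained filtering can be done bitwise; the sketch below keeps pixels where only the cloud and cloud-shadow bits are unset. The bit positions used here are assumptions for illustration; consult the HLS user guide for the authoritative bit layout:

# Illustrative bitwise filter; verify bit positions against the HLS user guide
CLOUD_BIT = 1         # assumed bit position for cloud
CLOUD_SHADOW_BIT = 3  # assumed bit position for cloud shadow
cloud_free = (fmask & (1 << CLOUD_BIT)) == 0
shadow_free = (fmask & (1 << CLOUD_SHADOW_BIT)) == 0
ndvi_relaxed = hls_ndvi.ndvi.where(cloud_free & shadow_free).rio.clip([fsUTM])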
Aggregate by month

Finally, we will use xarray's groupby operation to aggregate by month.
ndvi_clip_filter.groupby('time.month').mean('time').hvplot.image(x='x', y='y', crs=hls_proj, groupby='month', cmap='YlGn', width=800, height=600, colorbar=True).opts(clim=(0.0, 1.0))

rio_env.__exit__()

References
- https://rasterio.readthedocs.io/en/latest/
- https://corteva.github.io/rioxarray/stable/index.html
- https://tutorial.dask.org/index.html
- https://examples.dask.org/applications/satellite-imagery-geotiff.html